Parallelize task instance gather and support --gather on Modal by AlienKevin · Pull Request #199 · SWE-bench/SWE-smith

AlienKevin · 2026-01-16T22:48:32Z

This PR introduces parallel processing to the task instance gathering phase to significantly improve performance for large datasets and adds support for the gather phase in the Modal workflow.

Key Changes:

Parallel Gathering (swesmith/harness/gather.py):
Before this PR, large repos like math.js with >800 task instances timed out after 20 minutes due to slow, sequential git branch creation and git push. After this PR, large repos finish in minutes.
- Implemented ProcessPoolExecutor to process task instances in parallel, utilizing multiple cores.
- Added unique, PID-based clone paths (e.g., repo_name_pid_subfolder) to prevent race conditions during concurrent Git operations.
- Refactored the main loop into a process_instance worker function.
Modal Support (scripts/bug_gen_modal.py):
- Support task instance gathering with a --gather CLI flag (skipping generation/validation).
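
The parallel gathering described above can be sketched roughly as follows. This is a minimal illustration, not the code in swesmith/harness/gather.py: the `clone_path_for` and `gather_parallel` helpers, the dictionary keys, and the body of `process_instance` are assumptions made for the example, and the real worker would perform the Git branch creation and push.

```python
import os
from concurrent.futures import ProcessPoolExecutor, as_completed


def clone_path_for(repo_name: str, subfolder: str) -> str:
    """Build a clone directory name that is unique to the current process."""
    # Assumed layout, following the repo_name_pid_subfolder example in the PR text.
    return f"{repo_name}_{os.getpid()}_{subfolder}"


def process_instance(instance: dict) -> dict:
    """Hypothetical worker: real code would clone, branch, and push here."""
    path = clone_path_for(instance["repo"], instance["instance_id"])
    return {"instance_id": instance["instance_id"], "clone_path": path}


def gather_parallel(instances: list[dict], max_workers: int = 4) -> list[dict]:
    """Fan task instances out to worker processes and collect the results."""
    results = []
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(process_instance, inst) for inst in instances]
        for future in as_completed(futures):
            results.append(future.result())
    return results
```

Because each worker process has its own PID, two workers handling the same repository never share a clone directory. `gather_parallel` should be called from under an `if __name__ == "__main__":` guard so that it also works on platforms that spawn rather than fork worker processes.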

Question: do we want to rename FAIL_TO_PASS to PASS_TO_FAIL?
swe-smith currently uses FAIL_TO_PASS for tests that pass before the bug patch but fail afterwards, which inverts the semantics and causes confusion. A more intuitive name would be PASS_TO_FAIL, so I used this convention in this PR. However, if we adopt this new convention, the rest of the code and datasets need to be updated, so I'm not sure whether it's worth it.

Resolution: Flip PASS_TO_FAIL to FAIL_TO_PASS, in alignment with the SWE-bench naming convention, when outputting the task instance JSON files.
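
One way to picture the resolution is a rename applied only at serialization time. The sketch below is an assumption about the shape of that step, not the PR's actual code: `to_swebench_keys` is a hypothetical helper, and the sample instance is made up for illustration.

```python
import json


def to_swebench_keys(instance: dict) -> dict:
    """Return a copy with the internal PASS_TO_FAIL key written as FAIL_TO_PASS."""
    out = dict(instance)  # shallow copy so the in-memory instance is untouched
    if "PASS_TO_FAIL" in out:
        out["FAIL_TO_PASS"] = out.pop("PASS_TO_FAIL")
    return out


# Made-up instance, for illustration only.
internal = {
    "instance_id": "demo.1",
    "PASS_TO_FAIL": ["test_a"],
    "PASS_TO_PASS": ["test_b"],
}
serialized = json.dumps(to_swebench_keys(internal), sort_keys=True)
```

Keeping the rename at the output boundary means existing datasets and downstream consumers continue to see the FAIL_TO_PASS key.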

Test command

uv run modal run scripts/bug_gen_modal.py --language javascript --gather &> gather.log

Previously, the script would fail if `git commit` was attempted with no changes. This was observed in cases like `Automattic__mongoose.5f57a5bb` where the applied patch resulted in no tracked changes. Now, we check `git status --porcelain` before committing and skip the instance if no changes are detected.
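
The check described above might look something like this. It is a sketch under assumptions: `has_changes` and `commit_if_changed` are hypothetical names, and the staging step (`git add -A`) is a guess at what precedes the commit in the real script.

```python
import subprocess


def has_changes(porcelain_output: str) -> bool:
    """`git status --porcelain` prints one line per changed path; empty means clean."""
    return bool(porcelain_output.strip())


def commit_if_changed(repo_dir: str, message: str) -> bool:
    """Commit only when the working tree is dirty; return False to signal a skip."""
    status = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    if not has_changes(status.stdout):
        return False  # nothing to commit, so the caller should skip this instance
    subprocess.run(["git", "add", "-A"], cwd=repo_dir, check=True)  # assumed staging step
    subprocess.run(["git", "commit", "-m", message], cwd=repo_dir, check=True)
    return True
```

Checking the status first avoids relying on the non-zero exit code that `git commit` returns when there is nothing to commit.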

codecov · 2026-01-16T22:54:22Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

Files with missing lines	Coverage Δ
swesmith/profiles/base.py	`82.82% <100.00%> (+0.09%)`	⬆️

- Switch from per-task clones to per-worker persistent repositories.
- Reduces clone operations from O(tasks) to O(workers) (e.g. 1400 -> 17).
- Eliminates file locking race conditions.
- Total gather time for JavaScript is now ~5 minutes (bottlenecked by math.js).
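
The per-worker persistent repository idea can be illustrated with a process-local cache. This is only a sketch of the general technique: `_WORKER_REPOS`, `get_worker_repo`, and `fake_clone` are hypothetical names, and a stand-in callback replaces the real `git clone` so the example runs without network access.

```python
import os
import tempfile
from typing import Callable

# One cache per worker process: repo name -> local checkout path.
_WORKER_REPOS: dict[str, str] = {}


def get_worker_repo(repo_name: str, clone: Callable[[str, str], None]) -> str:
    """Return this worker's checkout of repo_name, cloning only on first use."""
    if repo_name not in _WORKER_REPOS:
        dest = os.path.join(tempfile.gettempdir(), f"{repo_name}_{os.getpid()}")
        clone(repo_name, dest)  # hypothetical callback that would run `git clone`
        _WORKER_REPOS[repo_name] = dest
    return _WORKER_REPOS[repo_name]


clone_calls = []


def fake_clone(repo_name: str, dest: str) -> None:
    """Stand-in for a real clone; it only records that it was called."""
    clone_calls.append((repo_name, dest))


# 100 tasks for the same repo in one worker trigger a single clone.
paths = [get_worker_repo("mathjs", fake_clone) for _ in range(100)]
```

Since the cache lives in module state, each worker process pays the clone cost once per repository, which is where the drop from O(tasks) to O(workers) comes from.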

Cause:
- gather invoked apply commands with a relative patch path (`../logs/run_validation/.../patch.diff`).
- During Modal gather, each worker runs from a temporary repo directory under `/tmp/...`, so that relative path did not resolve to the mounted logs directory.
- `git apply` and fallback `patch` both failed with "can't open patch ... No such file or directory", resulting in dropped instances and empty/underfilled task outputs.
Fix:
- Resolve `patch.diff` to an absolute path before apply.
- Shell-quote that absolute path and pass it to every command in `GIT_APPLY_CMDS`.
Result:
- Patch application no longer depends on worker cwd; gather can apply valid Rust patches and produce task instances consistently.
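
A rough sketch of the fix, assuming a POSIX shell: the command templates below are hypothetical placeholders (the real contents of `GIT_APPLY_CMDS` may differ), and `build_apply_commands` is a name invented for the example.

```python
import shlex
from pathlib import Path

# Hypothetical templates; the real GIT_APPLY_CMDS list may differ.
APPLY_CMD_TEMPLATES = [
    "git apply --verbose {patch}",
    "patch --batch -p1 -i {patch}",
]


def build_apply_commands(patch_path: str) -> list[str]:
    """Resolve the patch to an absolute path, quote it, and fill every template."""
    absolute = Path(patch_path).resolve()  # anchored to the cwd at call time
    quoted = shlex.quote(str(absolute))  # safe for spaces and shell metacharacters
    return [template.format(patch=quoted) for template in APPLY_CMD_TEMPLATES]


# The space in the directory name shows why quoting matters.
commands = build_apply_commands("../logs/run validation/patch.diff")
```

`Path.resolve()` anchors a relative path to the working directory at the moment it is called, so the resolution has to happen while the process is still in the directory the relative path was written for, before any worker switches to its temporary repo directory.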

Root cause: upload_tasks_to_hf_modal.py was hardcoded to javascript paths in both task discovery and per-repo processing. Running with --language rust still read /data/javascript/... and javascript/task_insts, which breaks Rust upload workflows and can surface as missing/empty problem statements for non-JS datasets.
Fix: thread a language argument through the local entrypoint and worker function, list files from {language}/task_insts, and pass language explicitly through process_repo.map so each worker reads /data/{language}/task_insts and /data/{language}/issue_gen.
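
Threading the language through might look like the sketch below. It does not use Modal: the built-in `map` stands in for `process_repo.map`, and `language_dirs`, the body of `process_repo`, and the file names are assumptions made for illustration.

```python
from pathlib import PurePosixPath

DATA_ROOT = PurePosixPath("/data")


def language_dirs(language: str) -> dict[str, PurePosixPath]:
    """Derive per-language volume directories instead of hardcoding javascript."""
    base = DATA_ROOT / language
    return {"task_insts": base / "task_insts", "issue_gen": base / "issue_gen"}


def process_repo(task_file: str, language: str) -> str:
    """Hypothetical worker body: report which task file it would read."""
    return str(language_dirs(language)["task_insts"] / task_file)


task_files = ["repo_a.json", "repo_b.json"]  # made-up file names
language = "rust"
# Built-in map stands in for Modal's process_repo.map; language is passed per call.
read_paths = list(map(process_repo, task_files, [language] * len(task_files)))
```

Passing the language as an explicit argument to every call keeps workers from silently falling back to a default directory.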

AlienKevin and others added 8 commits January 16, 2026 08:56

Support --gather in bug_gen_modal.py

0d77bc7

Update --gather to store to /logs/{language}/task_insts

1465908

Only write out json if task instances is not empty

5d0cf3d

Doubled modal sandbox time out to 20 minutes to account for repos that take longer to gather

96efde4

feat: parallelize gather.py and fix thread safety

d065420

Reset MODAL_TIMEOUT back down to 10 minutes now that gather is parallelized

84f8587

[pre-commit.ci] auto fixes from pre-commit.com hooks

67a8529

for more information, see https://pre-commit.ci

AlienKevin added 4 commits January 16, 2026 19:46

Replace slow and stateful git checkout and branch -D with a single stateless git ls-remote

842650a

Cache repo locally to avoid rate limits and speed up cloning

af01d5c

Flip PASS_TO_FAIL to FAIL_TO_PASS following SWE-bench naming convention

e42a5e2

AlienKevin force-pushed the kevin/bug-gen-gather branch from 8cefa3c to e42a5e2 Compare January 17, 2026 07:35

Remove unused shutil import in gather.py

302b73e

AlienKevin force-pushed the kevin/bug-gen-gather branch from 98898a8 to 302b73e Compare January 17, 2026 07:39

pre-commit-ci bot and others added 5 commits January 17, 2026 07:39

[pre-commit.ci] auto fixes from pre-commit.com hooks

393aba4

for more information, see https://pre-commit.ci

First draft for issue gen

1f0601e

Support PortKey for issue gen and switch to gpt-5-mini

a50e614

Uncomment gather part in bug_gen_modal.py

9461c1f

Add script to upload task instances to Hugging Face

f7f68cb

AlienKevin force-pushed the kevin/bug-gen-gather branch from 52ba582 to f7f68cb Compare January 20, 2026 08:20

pre-commit-ci bot and others added 8 commits January 20, 2026 08:20

[pre-commit.ci] auto fixes from pre-commit.com hooks

4976d85

for more information, see https://pre-commit.ci

Refactor bug generation phases to use --phases argument instead of --gather flag

2ab21d8

Merge remote-tracking branch 'upstream/main' into kevin/bug-gen-gather

6f514a3

Make Modal HF upload robust with single remote path

9c58d47

[pre-commit.ci] auto fixes from pre-commit.com hooks

3b9fe1d

for more information, see https://pre-commit.ci

Merge branch 'main' into kevin/bug-gen-gather

b13ca66

AlienKevin and others added 14 commits February 27, 2026 18:52

Support backfilling patch diffs

0a61157

Relax gather timeout to 1 hour

b2893ee

Merge upstream main

6c6469b

Merge remote-tracking branch 'upstream/main' into kevin/bug-gen-gather

f8e9455

Merge remote-tracking branch 'origin/kevin/bug-gen-gather' into kevin/bug-gen-gather

17f5365

[pre-commit.ci] auto fixes from pre-commit.com hooks

e36f583

for more information, see https://pre-commit.ci

Add modal helpers for issue-task uploads

0ea5927

Merge remote-tracking branch 'upstream/main' into kevin/bug-gen-gather

12c37d4

# Conflicts: # scripts/bug_gen_modal.py

Merge remote-tracking branch 'origin/kevin/bug-gen-gather' into kevin/bug-gen-gather

f337083

[pre-commit.ci] auto fixes from pre-commit.com hooks

3ce0bd2

for more information, see https://pre-commit.ci

Fix Ruff issues in modal scripts

172ca18

Merge remote-tracking branch 'origin/kevin/bug-gen-gather' into kevin/bug-gen-gather

2ba2306

Remove unused import in bug gen modal

5424d12

Remove unused modal helper scripts

2de1c69

AlienKevin merged commit 9f2ba94 into SWE-bench:main Mar 9, 2026
4 checks passed

AlienKevin merged 40 commits into SWE-bench:main from AlienKevin:kevin/bug-gen-gather
